Abstract

We calculate the probability $p_c$ that the maximum of a reflected Brownian motion $U$ is achieved on a complete excursion, i.e. $p_c := P\big(\overline{U}(t) = U^*(t)\big)$, where $\overline{U}(t)$ (respectively $U^*(t)$) is the maximum of the process $U$ over the time interval $[0,t]$ (resp. $[0,g(t)]$, where $g(t)$ is the last zero of $U$ before $t$).

1 Introduction

1.1 Motivation. The local score of a biological sequence is its "best" segment with respect to some scoring scheme (see e.g. [12] for more details), and the knowledge of its distribution is important (see e.g. [8], [9]). Let us briefly recall the mathematical setting; biological interpretations can be found in [5]. Let $S_n := \epsilon_1 + \cdots + \epsilon_n$ be the random walk generated by the sequence of independent and identically distributed random variables $(\epsilon_i, i \in \mathbb{N})$ that are centered with unit variance. The local score is the process $U_n := S_n - \min_{0 \le i \le n} S_i$, where $n \ge 0$. The path of $(U_n, n \in \mathbb{N})$ is a succession of zeros and excursions above 0. In [5], the authors only took into consideration complete excursions up to a fixed time $n$, and so considered the maximum $U_n^*$ of the heights of all the complete excursions up to time $n$ instead of the maximum $\overline{U}_n$ of the path until time $n$. They also introduced the random time $\theta_n^*$ of the length of the segment that realizes $U_n^*$. Since it is easy to simulate $(S_k, 0 \le k \le n)$ for any $n$ not too large, we get an approximation of the law of $(U_n^*, \theta_n^*)$ for a given $n$. Simulations have shown that, for a substantial proportion of sequences, $\overline{U}_n$ is realized during the last incomplete excursion. As expected, the number of excursions increases when the length of the sequence grows; however, the proportion of sequences that achieve their maximum on a complete excursion remains strikingly constant. The main goal of this study is to explain these observations and to calculate this probability when $n$ is large, see Proposition 1.2 below.

1.2 Link with the Brownian motion. According to the functional convergence theorem of Donsker, the random walk $(S_k, 0 \le k \le n)$ (resp. $(U_k, 0 \le k \le n)$) normalized by the factor $1/\sqrt{n}$ converges in distribution, as $n \to \infty$, to the Brownian motion (resp. the reflected Brownian motion). We prove (see Theorem 1.1 for a precise formulation) that the probability that the maximum of a reflected Brownian motion over a finite interval $[0,t]$ is achieved on a complete excursion is around 30% and is independent of $t$. This result answers the two questions raised in the discrete setting when $n$ is large.

Let $U$ be the reflected Brownian motion started at 0, i.e. $U_t = |B_t|$, where $(B_t)_t$ is the standard one-dimensional Brownian motion started at 0. In Chabriac et al. [5], the authors have considered two maxima, $\overline{U}(t)$ and $U^*(t)$, the first (resp. second) one being the maximum of $U$ up to time $t$ (resp. up to the last zero before $t$), namely $\overline{U}(t) := \max_{0 \le s \le t} U_s$ and $U^*(t) := \overline{U}(g(t))$, where $g(t) := \sup\{s \le t,\ U(s) = 0\}$. In [5], the density function of the pair $(U^*(t), \theta^*(t))$ has been calculated, where $\theta^*(t)$ is the first hitting time of level $U^*(t)$ by the process $(U_s, 0 \le s \le g(t))$. Here we only deal with $U^*(t)$ and $\overline{U}(t)$.

It is clear that $\overline{U}(t) = U^*(t)$ if and only if $U^{**}(t) < \overline{U}(t)$, where

\[ U^{**}(t) := \max_{g(t) \le s \le t} U(s). \tag{1.1} \]

In that case, the maximum of $U$ over $[0,t]$ is the maximum of all the complete excursions of $U$ which end before $t$. We introduce the probability $p_c$ that the maximum of $U$ over $[0,t]$ is achieved on a complete excursion:

\[ p_c := P\big(\overline{U}(t) = U^*(t)\big) = P\big(U^*(t) > U^{**}(t)\big). \tag{1.2} \]

Let $\psi$ be the logarithmic derivative of the Gamma function:

\[ \psi(x) := \Gamma'(x)/\Gamma(x). \tag{1.3} \]

The main result of our study is

Theorem 1.1. The probability $p_c$ equals $\psi(1/4) - \psi(1/2) + 1 + \pi/2 \approx 0.3069$.
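The numerical value in Theorem 1.1 can be checked directly. The following is a minimal sketch, not part of the original proof: the helper `digamma` is a hypothetical name, implemented here with the recurrence $\psi(x) = \psi(x+1) - 1/x$ and the standard asymptotic expansion of $\psi$. The classical special values $\psi(1/2) = -\gamma - 2\ln 2$ and $\psi(1/4) = -\gamma - \pi/2 - 3\ln 2$ (not quoted in the text above) imply that the expression of Theorem 1.1 reduces to $1 - \ln 2$, which the code also compares against.

```python
import math

def digamma(x: float) -> float:
    """Digamma psi(x) for real x > 0 (illustrative helper, not from the paper).

    Shifts x upward with psi(x) = psi(x + 1) - 1/x, then applies the
    asymptotic expansion ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6).
    """
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    tail = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - tail

def p_c() -> float:
    """Value of psi(1/4) - psi(1/2) + 1 + pi/2 from Theorem 1.1."""
    return digamma(0.25) - digamma(0.5) + 1.0 + math.pi / 2.0

if __name__ == "__main__":
    print(p_c(), 1.0 - math.log(2.0))
```

Both printed numbers should agree to many digits and round to 0.3069.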

1.3 Back to discrete sequences. We now go back to the setting of random walks introduced in paragraph 1.1. Let $p_c^{(n)}$ be the probability that the maximum of $(U_k, 0 \le k \le n)$ is achieved on a complete excursion, namely

\[ p_c^{(n)} := P\Big(\max_{0 \le k \le n} U_k = \max_{0 \le k \le g_n} U_k\Big), \quad \text{where } g_n := \max\{k \le n,\ U_k = 0\}. \tag{1.4} \]

Proposition 1.2. $p_c^{(n)}$ converges to $p_c$ as $n \to \infty$.

The convergence of $p_c^{(n)}$ can be obtained from Theorem 3.3 in [5] and the fact that the event $N$ defined by (4.23) in [5] is actually included in $\big\{\max_{0 \le k \le n} U_k = \max_{0 \le k \le g_n} U_k\big\}$.
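The discrete probability in (1.4) is easy to estimate by simulation. The sketch below is only an illustration under stated assumptions: the steps are taken to be standard Gaussian (one admissible choice of centered, unit-variance increments), and the function names, sample sizes and seed are hypothetical. For moderate $n$ the estimate should already be in the neighbourhood of 0.3, in line with Proposition 1.2.

```python
import random

def max_on_complete_excursion(n: int, rng: random.Random) -> bool:
    """Simulate U_k = S_k - min_{i<=k} S_i for k <= n and report whether
    max_{k<=n} U_k equals max_{k<=g_n} U_k, with g_n the last zero of U."""
    s = 0.0              # current value S_k
    s_min = 0.0          # running minimum of S (S_0 = 0 included)
    best = 0.0           # max of U over [0, k]
    best_at_zero = 0.0   # max of U over [0, g_k], g_k = last zero so far
    for _ in range(n):
        s += rng.gauss(0.0, 1.0)   # assumption: Gaussian steps
        if s <= s_min:
            # New running minimum: U_k = 0, so every excursion so far is complete.
            s_min = s
            best_at_zero = best
        else:
            best = max(best, s - s_min)
    return best == best_at_zero

def estimate_p_c_n(n: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of p_c^(n) from (1.4)."""
    rng = random.Random(seed)
    hits = sum(max_on_complete_excursion(n, rng) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    print(estimate_p_c_n(n=400, trials=1500, seed=1))
```

With continuous steps, ties between the two maxima can only occur when the last excursion does not raise the running maximum, so the equality test on floats is exact here.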

1.4 Main steps of the proof. We now consider the Brownian motion setting. The density function of $\overline{U}(t)$ is known (see either Subsection 2.11 in [?] or Lemma 3.2 in [11]) and the one of $U^*(t)$ has been calculated in [5]. Obviously, the knowledge of the distributions of $\overline{U}(t)$ and $U^*(t)$ is not sufficient to determine $p_c$. The trajectory of $(U_s, 0 \le s \le t)$ naturally splits into two parts, before and after the random time $g(t)$, which is not a stopping time. Although $(U_s, 0 \le s \le g(t))$ and $(U_s, g(t) \le s \le t)$ are not independent, the scaling with $g(t)$ leads to independence. Indeed,

\[ \Big(\frac{1}{\sqrt{g(t)}}\, U(s\,g(t)),\ 0 \le s \le 1\Big), \quad \Big(\frac{1}{\sqrt{t - g(t)}}\, \big|B\big(g(t) + s(t - g(t))\big)\big|,\ 0 \le s \le 1\Big), \quad g(t) \tag{1.5} \]

are independent. Moreover, each part of the above triplet has a known distribution. The process $\big(g(t)^{-1/2} B(g(t)s),\ 0 \le s \le 1\big)$ is distributed as the Brownian bridge $(b(s), 0 \le s \le 1)$ (see e.g. [11]), and the second component in (1.5) is the Brownian meander, denoted $m$. The scaling property of the Brownian motion implies that $g(t)$ is distributed as $t\,g(1)$, while the distribution of $g(1)$ is the arcsine law (see again [11]):

\[ P(g(1) \in dx) = \frac{1}{\pi\sqrt{x(1-x)}}\, \mathbf{1}_{[0,1]}(x)\, dx. \tag{1.6} \]

Consequently,

\[ \big(U^*(t), U^{**}(t)\big) \overset{(d)}{=} \Big(\sqrt{t\,g(1)}\; b^*,\ \sqrt{t(1-g(1))}\, \max_{0 \le u \le 1} m(u)\Big) \tag{1.7} \]

where $b^* := \sup_{0 \le s \le 1} |b(s)|$. Its distribution function is given by the Kolmogorov-Smirnov formula (see e.g. [10]):

\[ P(b^* > x) = 2\sum_{k \ge 1} (-1)^{k-1} e^{-2k^2x^2}, \quad x > 0. \tag{1.8} \]
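As an illustration of the Kolmogorov-Smirnov series $P(b^* > x) = 2\sum_{k \ge 1} (-1)^{k-1} e^{-2k^2x^2}$, the following sketch evaluates a truncated sum. The function name and the truncation level are assumptions made for the example; the series converges very quickly unless $x$ is close to 0.

```python
import math

def bridge_sup_tail(x: float, terms: int = 100) -> float:
    """Truncated Kolmogorov-Smirnov series for P(b* > x), x > 0."""
    total = 0.0
    sign = 1.0
    for k in range(1, terms + 1):
        total += sign * math.exp(-2.0 * k * k * x * x)
        sign = -sign
    return 2.0 * total

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, bridge_sup_tail(x))
```

For example, the value at $x = 1$ is close to 0.27, and the tail decreases as $x$ grows.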

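Formula (1.7) determines the law of $(U^*(t), U^{**}(t))$ once we know the distribution of $\max_{0 \le u \le 1} m(u)$. But by [?], for any bounded Borel function $f$,

\[ E\big[f(m(u),\ 0 \le u \le 1)\big] = \sqrt{\frac{\pi}{2}}\; E\Big[\frac{1}{R(1)}\, f(R(u),\ 0 \le u \le 1)\Big] \tag{1.9} \]

where $(R(u), 0 \le u \le 1)$ stands for a 3-dimensional Bessel process started at 0.

Due to the scaling property (1.7), we deduce that $p_c$ does not depend on $t$ and

\[ p_c = \sqrt{\frac{\pi}{2}}\; E\Big[F\Big(b^* \sqrt{\frac{g(1)}{1 - g(1)}}\Big)\Big] \tag{1.10} \]

where

\[ F(x) := E\Big[\frac{1}{R(1)}\, \mathbf{1}_{\{\max_{0 \le u \le 1} R(u) < x\}}\Big]. \tag{1.11} \]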


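According to (1.10), we first have to determine the function $F$ (see Lemma 2.1 below), second the distribution function of $b^*\sqrt{g(1)/(1-g(1))}$ (see Proposition 2.2), and third to calculate the expectation. The details are given in Section 2.

2 Proof of Theorem 1.1

Lemma 2.1. For any $x > 0$,

\[ F(x) = \sqrt{\frac{2}{\pi}} \sum_{k \in \mathbb{Z}} \Big[e^{-2k^2x^2} - e^{-(2k+1)^2x^2/2}\Big] = \frac{4}{x} \sum_{k \ge 0} \exp\Big\{-\frac{(2k+1)^2\pi^2}{2x^2}\Big\}. \tag{2.1} \]

Proof. First, by [4, formula 1.1.8, p. 317],

\[ P\Big(\max_{0 \le u \le 1} R(u) < y,\ R(1) \in dz \,\Big|\, R(0) = x\Big) = \frac{1}{\sqrt{2\pi}}\, \frac{z}{x}\, S\; \mathbf{1}_{\{y > x,\ z < y\}}\, dz \tag{2.2} \]

where

\[ S := \sum_{k \in \mathbb{Z}} \Big[\exp\Big\{-\frac{(z - x + 2ky)^2}{2}\Big\} - \exp\Big\{-\frac{(z + x + 2ky)^2}{2}\Big\}\Big]. \]

A Taylor expansion of $x \mapsto \exp\{-(z \pm x + 2ky)^2/2\}$ at $x = 0$ leads to

\[ P\Big(\max_{0 \le u \le 1} R(u) < y,\ R(1) \in dz \,\Big|\, R(0) = 0\Big) = \sqrt{\frac{2}{\pi}}\; z \sum_{k \in \mathbb{Z}} (z + 2ky) \exp\Big\{-\frac{(z + 2ky)^2}{2}\Big\}\, \mathbf{1}_{\{z < y\}}\, dz. \]

As a consequence,

\[ F(y) = \sqrt{\frac{2}{\pi}} \sum_{k \in \mathbb{Z}} \int_0^y (z + 2ky) \exp\Big\{-\frac{(z + 2ky)^2}{2}\Big\}\, dz = \sqrt{\frac{2}{\pi}} \sum_{k \in \mathbb{Z}} \Big[\exp\{-2k^2y^2\} - \exp\Big\{-\frac{(2k+1)^2y^2}{2}\Big\}\Big]. \]

The second equality in (2.1) is a direct consequence of the Poisson summation formula (see, e.g. [7, Chap. XIX, p. 630]):

\[ \frac{1}{\sqrt{2\pi t}} \sum_{k \in \mathbb{Z}} \Big[\exp\Big\{-\frac{1}{2t}\big(x_0 + 2k(\beta_0 - \alpha_0)\big)^2\Big\} - \exp\Big\{-\frac{1}{2t}\big(2\beta_0 - x_0 + 2k(\beta_0 - \alpha_0)\big)^2\Big\}\Big] = \frac{1}{\beta_0 - \alpha_0} \sum_{k \ge 1} \Big[\cos\Big(\frac{k\pi x_0}{\beta_0 - \alpha_0}\Big) - \cos\Big(\frac{k\pi(2\beta_0 - x_0)}{\beta_0 - \alpha_0}\Big)\Big] \exp\Big\{-\frac{k^2\pi^2 t}{2(\beta_0 - \alpha_0)^2}\Big\}, \]

applied with $t = 1$, $x_0 = 0$, $\beta_0 = x/2$ and $\alpha_0 = -x/2$. □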
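The Poisson summation step can be checked numerically. The sketch below assumes that the two representations of $F$ in Lemma 2.1 read $\sqrt{2/\pi}\sum_{k \in \mathbb{Z}} [e^{-2k^2x^2} - e^{-(2k+1)^2x^2/2}]$ and $(4/x)\sum_{k \ge 0} \exp\{-(2k+1)^2\pi^2/(2x^2)\}$, which is what the formula above yields with $t = 1$, $x_0 = 0$, $\beta_0 = x/2$, $\alpha_0 = -x/2$. Function names and truncation levels are hypothetical.

```python
import math

def F_gaussian_series(x: float, K: int = 200) -> float:
    """sqrt(2/pi) * sum_{k=-K}^{K} [exp(-2 k^2 x^2) - exp(-(2k+1)^2 x^2 / 2)]."""
    total = 0.0
    for k in range(-K, K + 1):
        total += math.exp(-2.0 * k * k * x * x)
        total -= math.exp(-((2 * k + 1) ** 2) * x * x / 2.0)
    return math.sqrt(2.0 / math.pi) * total

def F_dual_series(x: float, K: int = 200) -> float:
    """(4/x) * sum_{k=0}^{K} exp(-(2k+1)^2 pi^2 / (2 x^2))."""
    total = 0.0
    for k in range(K + 1):
        total += math.exp(-((2 * k + 1) ** 2) * math.pi ** 2 / (2.0 * x * x))
    return 4.0 / x * total

if __name__ == "__main__":
    for x in (0.5, 1.0, 3.0):
        print(x, F_gaussian_series(x), F_dual_series(x))
```

The first series converges quickly for large $x$ and the second for small $x$, which is the practical benefit of having both forms.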

Proposition 2.2. For any $x > 0$,

\[ P\Big(b^* \sqrt{\frac{g(1)}{1 - g(1)}} > x\Big) = \frac{2}{\pi} \int_0^\infty A(u)\, e^{-2x^2u}\, \frac{du}{\sqrt{u}} \tag{2.3} \]

where

\[ A(u) := \sum_{k \ge 1} (-1)^{k-1} \frac{k}{k^2 + u}. \tag{2.4} \]
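Proof. We introduce a cut-off $0 < \varepsilon < 1$ and we define:

\[ \gamma_\varepsilon(x) := P\Big(g(1) < 1 - \varepsilon,\ b^* \sqrt{\frac{g(1)}{1 - g(1)}} > x\Big). \tag{2.5} \]

Using the independence between $g(1)$ and $b^*$, (1.6) and (1.8), we deduce:

\[ \gamma_\varepsilon(x) = \frac{1}{\pi} \int_0^{1-\varepsilon} \frac{1}{\sqrt{y(1-y)}}\, P\Big(b^* > x \sqrt{\frac{1-y}{y}}\Big)\, dy = \frac{2}{\pi} \sum_{k \ge 1} I_k(\varepsilon) \]

where $I_k(\varepsilon) := (-1)^{k-1} \int_0^{1-\varepsilon} \frac{1}{\sqrt{y(1-y)}} \exp\Big\{-\frac{2k^2x^2(1-y)}{y}\Big\}\, dy$. The interchange of the sum and the integral is justified since $(1-y)/y \ge \varepsilon' > 0$, where $\varepsilon' := \varepsilon/(1-\varepsilon)$. Making the change of variables $(1-y)/y = u/k^2$ leads to:

\[ I_k(\varepsilon) = \frac{(-1)^{k-1}}{k} \int_{\varepsilon' k^2}^\infty \frac{1}{\sqrt{u}\,(1 + u/k^2)} \exp\{-2x^2u\}\, du. \]

The identity

\[ \frac{1}{1 + u/k^2} = 1 - \frac{u}{k^2(1 + u/k^2)} \]

allows us to interchange the sum and the integral. Finally we get:

\[ \gamma_\varepsilon(x) = \frac{2}{\pi} \int_{\varepsilon'}^\infty \Big(\sum_{k^2 \le u/\varepsilon'} \frac{(-1)^{k-1}}{k(1 + u/k^2)}\Big) \exp\{-2x^2u\}\, \frac{du}{\sqrt{u}} = \frac{2}{\pi} \int_{\varepsilon'}^\infty S_{n(\varepsilon',u)}(u)\, \frac{\exp\{-2x^2u\}}{\sqrt{u}}\, du \]

with $n(\varepsilon',u) := \lfloor \sqrt{u/\varepsilon'} \rfloor$, $S_n(u) := \sum_{k=1}^n (-1)^{k-1} \varphi(1/k, u)$ and $\varphi(y,u) := y/(1 + uy^2)$.

Note that $\big|\frac{\partial \varphi}{\partial y}\big| \le 1$; then, considering $n = 2m$ and $n = 2m+1$ and using the mean value inequality, we obtain:

\[ |S_{2m}(u)| = \Big|\sum_{k'=1}^m \Big[\varphi\Big(\frac{1}{2k'-1}, u\Big) - \varphi\Big(\frac{1}{2k'}, u\Big)\Big]\Big| \le \sum_{k'=1}^m \Big(\frac{1}{2k'-1} - \frac{1}{2k'}\Big) \le \sum_{k' \ge 1} \frac{1}{2k'(2k'-1)} < \infty. \tag{2.6} \]

Similarly, $|S_{2m+1}(u)|$ is bounded uniformly in $m$ and $u$. Since $\varepsilon' \to 0$ and $n(\varepsilon',u) \to \infty$ as $\varepsilon$ goes to 0, identity (2.3) is a direct consequence of the Lebesgue dominated convergence theorem. □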

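Lemma 2.1 and Proposition 2.2 allow us to obtain a new integral form for $p_c$.

Lemma 2.3. One has

\[ p_c = 8 \int_0^\infty \frac{u A(u^2)}{\sinh(2\pi u)}\, du. \]

Proof. We deduce easily from (2.4) that

\[ A(u) = \sum_{k' \ge 1} \Big[\varphi\Big(\frac{1}{2k'-1}, u\Big) - \varphi\Big(\frac{1}{2k'}, u\Big)\Big], \]

where $\varphi(y,u) := y/(1 + uy^2)$. Then inequality (2.6) implies that $\sup_{u \ge 0} A(u) < \infty$. By Lemma 2.1, Proposition 2.2, the definition (2.4) of $A$ and the Fubini theorem, we get

\[ p_c = 16 \sqrt{\frac{2}{\pi}} \sum_{k \ge 0} \int_0^\infty \sqrt{u}\, A(u) \Big(\int_0^\infty \exp\Big\{-\frac{(2k+1)^2\pi^2}{2x^2} - 2x^2u\Big\}\, dx\Big)\, du. \]

But making $s = 2x^2u$ and letting $z = 2(2k+1)\pi\sqrt{u}$, we get:

\[ \int_0^\infty \exp\Big\{-\frac{(2k+1)^2\pi^2}{2x^2} - 2x^2u\Big\}\, dx = \frac{1}{2\sqrt{2u}} \int_0^\infty \exp\Big\{-s - \frac{z^2}{4s}\Big\}\, \frac{ds}{\sqrt{s}} = \frac{1}{\sqrt{2u}} \Big(\frac{z}{2}\Big)^{1/2} K_{-1/2}(z) \]

where (cf. [13, Formula (15) p. 183])

\[ K_\nu(x) = \frac{1}{2} \Big(\frac{x}{2}\Big)^\nu \int_0^\infty \exp\Big\{-s - \frac{x^2}{4s}\Big\}\, \frac{ds}{s^{\nu+1}}. \]

Recall that $K_\nu = K_{-\nu}$ [13, Formula (8) p. 79] and $K_{1/2}(x) = \sqrt{\pi/(2x)}\; e^{-x}$ [13, Formula (13) p. 80]. It follows that

\[ p_c = 8 \sum_{k \ge 0} \int_0^\infty A(u)\, e^{-2(2k+1)\pi\sqrt{u}}\, du = 4 \int_0^\infty \frac{A(u)}{\sinh(2\pi\sqrt{u})}\, du = 8 \int_0^\infty \frac{v A(v^2)}{\sinh(2\pi v)}\, dv. \]

□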
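The integral representation $p_c = 8\int_0^\infty u A(u^2)/\sinh(2\pi u)\, du$ with $A(u) = \sum_{k \ge 1} (-1)^{k-1} k/(k^2+u)$ lends itself to a direct numerical check. The following is a rough sketch under our own choices (not the authors' method): the alternating series is accelerated by averaging two consecutive partial sums, and the integral is approximated by a midpoint rule on a truncated range, which is harmless because of the exponential decay of $1/\sinh$. All names and parameters are hypothetical.

```python
import math

def A(u: float, n: int = 2000) -> float:
    """A(u) = sum_{k>=1} (-1)^(k-1) k / (k^2 + u), via averaged partial sums."""
    partial = 0.0
    sign = 1.0
    for k in range(1, n + 1):
        partial += sign * k / (k * k + u)
        sign = -sign
    # 'sign' now carries the sign of the (n+1)-th term.
    nxt = partial + sign * (n + 1) / ((n + 1) ** 2 + u)
    return 0.5 * (partial + nxt)

def p_c_from_integral(upper: float = 6.0, steps: int = 300) -> float:
    """Midpoint rule for 8 * int_0^upper u A(u^2) / sinh(2 pi u) du."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h   # midpoints avoid the removable point u = 0
        total += u * A(u * u) / math.sinh(2.0 * math.pi * u)
    return 8.0 * h * total

if __name__ == "__main__":
    print(p_c_from_integral())
```

The printed value should be close to the 0.3069 announced in Theorem 1.1.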
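We now focus on the function $A$. Our method is based on the crucial fact that $A$ can be expressed with the function $\psi$ defined by (1.3).

Lemma 2.4. 1. We have:

\[ A(u) = \frac{1}{4} \Big[\psi\Big(\frac{i\sqrt{u}}{2}\Big) + \psi\Big(-\frac{i\sqrt{u}}{2}\Big) - \psi\Big(\frac{1 + i\sqrt{u}}{2}\Big) - \psi\Big(\frac{1 - i\sqrt{u}}{2}\Big)\Big], \quad u > 0. \tag{2.7} \]

2. There exist $a, b > 0$ such that

\[ |\psi(z)| \le a + b|z|^2, \quad \forall z \in \mathbb{C},\ |\mathrm{Im}\, z| \ge 1. \tag{2.8} \]

Consequently, $p_c = I_2 - I_1$, where

\[ I_k := 2 \int_0^\infty \frac{v F_k(v)}{\sinh(2\pi v)}\, dv, \quad k = 1, 2 \tag{2.9} \]

and $F_1(v) := \psi\big(\frac{1+iv}{2}\big) + \psi\big(\frac{1-iv}{2}\big)$, $F_2(v) := \psi\big(\frac{iv}{2}\big) + \psi\big(-\frac{iv}{2}\big)$.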

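Proof. Formula (2.7) and inequality (2.8) are direct consequences of identity (3), p. 15 in [6], i.e.

\[ \psi(z) = -\gamma - \frac{1}{z} + z \sum_{n \ge 1} \frac{1}{n(z+n)} = -\gamma - \frac{1}{z} + \sum_{n \ge 1} \Big(\frac{z}{n^2} - \frac{z^2}{n^2(z+n)}\Big) \tag{2.10} \]

with $\gamma := \lim_{m \to \infty} \Big(\sum_{k=1}^m \frac{1}{k} - \ln m\Big)$. □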
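The link between $A$ and $\psi$ can be tested numerically. The sketch below compares the alternating series $A(u) = \sum_{k \ge 1} (-1)^{k-1} k/(k^2+u)$ with the digamma expression $\frac14[\psi(1 + i\sqrt{u}/2) + \psi(1 - i\sqrt{u}/2) - \psi((1 + i\sqrt{u})/2) - \psi((1 - i\sqrt{u})/2)]$. This shifted arrangement is our assumption: it follows from the partial fraction expansion of $\psi$ and the recurrence $\psi(z+1) = \psi(z) + 1/z$, and it has the advantage of being well defined at $u = 0$. The helper `digamma_c` is a hypothetical complex digamma built from that recurrence and the usual asymptotic expansion.

```python
import cmath
import math

def digamma_c(z: complex) -> complex:
    """Complex digamma for Re z > 0 (illustrative helper): recurrence to
    Re z >= 10, then ln z - 1/(2z) - 1/(12z^2) + 1/(120z^4) - 1/(252z^6)."""
    z = complex(z)
    acc = 0j
    while z.real < 10.0:
        acc -= 1.0 / z
        z += 1.0
    inv2 = 1.0 / (z * z)
    tail = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + cmath.log(z) - 0.5 / z - tail

def A_series(u: float, n: int = 4000) -> float:
    """Alternating series for A(u), accelerated by averaging partial sums."""
    partial, sign = 0.0, 1.0
    for k in range(1, n + 1):
        partial += sign * k / (k * k + u)
        sign = -sign
    nxt = partial + sign * (n + 1) / ((n + 1) ** 2 + u)
    return 0.5 * (partial + nxt)

def A_digamma(u: float) -> float:
    """Digamma expression for A(u) in the shifted form described above."""
    w = 0.5j * math.sqrt(u)
    val = (digamma_c(1 + w) + digamma_c(1 - w)
           - digamma_c(0.5 + w) - digamma_c(0.5 - w))
    return 0.25 * val.real   # the imaginary parts cancel by conjugate symmetry

if __name__ == "__main__":
    for u in (0.0, 1.0, 9.0):
        print(u, A_series(u), A_digamma(u))
```

At $u = 0$ both expressions reduce to $\ln 2$, the value of the alternating harmonic series.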


Due to the form of the functions $F_1$ and $F_2$, the integrals $I_1$ and $I_2$ can be viewed as integrals over a straight line in the complex plane. More precisely we have:

Lemma 2.5. $I_1$ and $I_2$ can be written as:

\[ I_1 = -8i \int_{\Delta_{1/2}} \frac{z - 1/2}{\sin(4\pi z)}\, \psi(z)\, dz, \qquad I_2 = -8i \int_{\Delta_0} \frac{z\, \psi(z+1)}{\sin(4\pi z)}\, dz \tag{2.11} \]

where, for any $a \in \mathbb{R}$, $\Delta_a$ is the line parametrized by $z = a + it$, $t \in \mathbb{R}$.
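Proof. 1) We note that $\sin(2\pi i v) = i \sinh(2\pi v)$. Thus

\[ 2 \int_0^\infty \frac{v}{\sinh(2\pi v)}\, \psi\Big(\frac{1 + iv}{2}\Big)\, dv = 4 \int_0^\infty \frac{iv/2}{\sin(4\pi\, iv/2)}\, \psi\Big(\frac{1}{2} + \frac{iv}{2}\Big)\, dv = -8i \int_{\Delta'_{1/2}} \frac{z - 1/2}{\sin(4\pi z)}\, \psi(z)\, dz \]

where $\Delta'_a$ is the half-line $z = a + it$, $t \ge 0$. Similarly

\[ 2 \int_0^\infty \frac{v}{\sinh(2\pi v)}\, \psi\Big(\frac{1 - iv}{2}\Big)\, dv = -8i \int_{\Delta''_{1/2}} \frac{z - 1/2}{\sin(4\pi z)}\, \psi(z)\, dz \]

where $\Delta''_a := \{z = a + it,\ t \le 0\}$ and $a \in \mathbb{R}$. This implies the value of $I_1$ given by (2.11).

2) Formula (2.10) tells us that $\tilde{\psi}(z) := \psi(z) + \frac{1}{z}$ has no singularity at $z = 0$. Thus we study

\[ 2 \int_0^\infty \frac{v}{\sinh(2\pi v)}\, \tilde{\psi}\Big(\frac{iv}{2}\Big)\, dv = 4 \int_0^\infty \frac{iv/2}{\sin(4\pi\, iv/2)}\, \tilde{\psi}\Big(\frac{iv}{2}\Big)\, dv = -8i \int_{\Delta'_0} \frac{z}{\sin(4\pi z)}\, \tilde{\psi}(z)\, dz \]

and similarly

\[ 2 \int_0^\infty \frac{v}{\sinh(2\pi v)}\, \tilde{\psi}\Big(-\frac{iv}{2}\Big)\, dv = -8i \int_{\Delta''_0} \frac{z}{\sin(4\pi z)}\, \tilde{\psi}(z)\, dz. \]

The identity (formula (8) p. 16 in [6]):

\[ \tilde{\psi}(z) = \psi(z) + \frac{1}{z} = \psi(z + 1) \tag{2.12} \]

implies $I_2 = -8i \int_{\Delta_0} \frac{z\, \psi(z+1)}{\sin(4\pi z)}\, dz$ and finally (2.11). □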


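We show in the following lemma that $I_1$ and $I_2$ are integrals over the same vertical line.

Lemma 2.6. Let $0 < \varepsilon < 1/4$. Then

\[ I_1 = -8i \int_{\Delta_{1/2+\varepsilon}} \frac{z - 1/2}{\sin(4\pi z)}\, \psi(z)\, dz, \qquad I_2 = -8i \int_{\Delta_{1/2+\varepsilon}} \frac{z\, \psi(z+1)}{\sin(4\pi z)}\, dz + \psi(1/4) - 2\psi(1/2). \tag{2.13} \]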

Proof. We only deal with $I_2$; the proof related to $I_1$ is similar and easier.

The quantity $\sin(4\pi z)$ vanishes at $z = k/4$ for every $k \in \mathbb{Z}$, and the zeros are simple. From (2.10), we deduce that $h(z) := z\, \psi(z+1)/\sin(4\pi z)$ is meromorphic in $\{z \in \mathbb{C};\ -1/4 < \mathrm{Re}\, z < 3/4\}$ with poles at 1/4 and 1/2. We introduce the contour defined in Figure 2.

[Figure 2: the contour $A_n B_n C_n D_n$ in the complex plane; the real axis is marked at 1/2, $1/2 + \varepsilon$ and 3/4.]

Then the residue theorem gives

\[ \int_{C_nB_n} h(z)\, dz - \int_{D_nA_n} h(z)\, dz + \int_{D_nC_n} h(z)\, dz - \int_{A_nB_n} h(z)\, dz = 2i\pi \big\{\mathrm{Res}(h, 1/4) + \mathrm{Res}(h, 1/2)\big\}. \tag{2.14} \]

The residue at 1/4 is given by

\[ \mathrm{Res}(h, 1/4) = \lim_{z \to 1/4} \big\{h(z)(z - 1/4)\big\} = \frac{\psi(5/4)}{4} \lim_{z \to 1/4} \frac{z - 1/4}{\sin(4\pi z) - \sin(4\pi \cdot 1/4)} = -\frac{1}{16\pi}\, \psi(5/4). \]

Using (2.12) with $z = 1/4$, we get $\mathrm{Res}(h, 1/4) = -\frac{1}{4\pi} \big(1 + \frac{1}{4}\psi(1/4)\big)$. Similarly, $\mathrm{Res}(h, 1/2) = \frac{1}{4\pi} \big(1 + \psi(1/2)/2\big)$.

Now, we let $n$ go to infinity. Inequality (2.8) implies

\[ \lim_{n \to \infty} \int_{A_nB_n} h(z)\, dz = \lim_{n \to \infty} \int_{C_nD_n} h(z)\, dz = 0. \]

Indeed, this follows from the parametrization of $A_nB_n$ of the type $z = in + t$, the inequality $|in + t| \le n + 1$ valid for $0 \le t \le \varepsilon + 1/2$, and

\[ |\sin(4\pi(in + t))|^2 = \sinh(4\pi n)^2 \cos(4\pi t)^2 + \cosh(4\pi n)^2 \sin(4\pi t)^2 \ge \min\big\{\sinh(4\pi n)^2, \cosh(4\pi n)^2\big\}. \]

We proceed analogously on $C_nD_n$. As a consequence, letting $n \to \infty$ in (2.14), we get

\[ I_2 = -8i \Big\{\int_{\Delta_{1/2+\varepsilon}} h(z)\, dz - 2i\pi \Big[-\frac{1}{4\pi}\big(1 + \psi(1/4)/4\big) + \frac{1}{4\pi}\big(1 + \psi(1/2)/2\big)\Big]\Big\} = -8i \int_{\Delta_{1/2+\varepsilon}} h(z)\, dz + \psi(1/4) - 2\psi(1/2). \]

□


Bringing together Lemmas 2.4 to 2.6 leads to

\[ p_c = -8i \int_{\Delta_{1/2+\varepsilon}} \frac{1}{\sin(4\pi z)} \Big(z\, \psi(z+1) - \Big(z - \frac{1}{2}\Big) \psi(z)\Big)\, dz + \psi(1/4) - 2\psi(1/2). \]

Setting $z = 1/2 + u$ and using identity (2.12) with $u + 1/2$ instead of $z$ gives:

\[ p_c = -8i \int_{\Delta_\varepsilon} h^+(u)\, du + \psi(1/4) - 2\psi(1/2), \tag{2.15} \]

where $h^\pm(z) := \frac{1}{\sin(4\pi z)} \Big(1 + \frac{1}{2}\, \psi\Big(\frac{1}{2} \pm z\Big)\Big)$.

We are not able to calculate $\int_{\Delta_\varepsilon} h^+(u)\, du$ directly; however, it is possible for $\int_{\Delta_\varepsilon} h^+(u)\, du \pm \int_{\Delta_\varepsilon} h^-(u)\, du$.


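Lemma 2.7. For any $\varepsilon \in\, ]0, 1/4[$,

\[ \int_{\Delta_\varepsilon} h^+(u)\, du + \int_{\Delta_\varepsilon} h^-(u)\, du = \frac{i}{2} \big(1 + \psi(1/2)/2\big), \tag{2.16} \]

\[ \int_{\Delta_\varepsilon} h^+(u)\, du - \int_{\Delta_\varepsilon} h^-(u)\, du = \frac{\pi}{2} \int_{\Delta_\varepsilon} \frac{\tan(\pi z)}{\sin(4\pi z)}\, dz. \tag{2.17} \]

Proof. 1) We begin by proving (2.16). The function $h^+$ is meromorphic in $\{z \in \mathbb{C};\ -1/4 < \mathrm{Re}\, z < 1/4\}$ with unique pole $z = 0$. Then the residue theorem yields

\[ \int_{\Delta_\varepsilon} h^+(z)\, dz = \int_{\Delta_{-\varepsilon}} h^+(z)\, dz + 2i\pi\, \mathrm{Res}(h^+, 0) = -\int_{\Delta_\varepsilon} h^-(z)\, dz + \frac{i}{2} \big(1 + \psi(1/2)/2\big). \]

2) Formula (2.17) is a direct consequence of formula 11 p. 16 in [6]: $\psi\big(\frac{1}{2} + z\big) - \psi\big(\frac{1}{2} - z\big) = \pi \tan(\pi z)$. □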


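It is easy to deduce from (2.16) and (2.17) that:

\[ \int_{\Delta_\varepsilon} h^+(z)\, dz = \frac{\pi}{4} \int_{\Delta_\varepsilon} \frac{\tan(\pi z)}{\sin(4\pi z)}\, dz + \frac{i}{4} \big(1 + \psi(1/2)/2\big). \]

Relation (2.15) implies directly that $p_c$ equals $\psi(1/4) - \psi(1/2) + 2 - 2\pi i\, a(\varepsilon)$, where $a(\varepsilon) := \int_{\Delta_\varepsilon} \frac{\tan(\pi z)}{\sin(4\pi z)}\, dz$. Since the real number $p_c$ does not depend on $\varepsilon$, letting $\varepsilon \to 0$ leads to:

\[ p_c = \psi(1/4) - \psi(1/2) + 2 - 2\pi i\, a(0+) \tag{2.18} \]

where $a(0+) = \int_{\Delta_0} \frac{\tan(\pi z)}{\sin(4\pi z)}\, dz$. This integral is easy to calculate:

\[ a(0+) = i \int_{\mathbb{R}} \frac{\tanh(\pi x)}{\sinh(4\pi x)}\, dx = \frac{i}{4} \int_{\mathbb{R}} \frac{dx}{\cosh^2(\pi x) \cosh(2\pi x)}. \]

We make the change of variable $u = \tanh(\pi x)$:

\[ a(0+) = \frac{i}{2\pi} \int_0^1 \frac{1 - u^2}{1 + u^2}\, du = \frac{i}{2\pi} \int_0^1 \Big(-1 + \frac{2}{1 + u^2}\Big)\, du = \frac{i}{2\pi} \Big(-1 + \frac{\pi}{2}\Big). \]

Theorem 1.1 follows from (2.18) and the above result.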
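The last step can be confirmed numerically. The sketch below approximates $J := \int_{\mathbb{R}} \tanh(\pi x)/\sinh(4\pi x)\, dx$, so that $a(0+) = iJ$ and $-2\pi i\, a(0+) = 2\pi J$, and compares it with $\frac{1}{2\pi}(\frac{\pi}{2} - 1)$. It then assembles $p_c = \psi(1/4) - \psi(1/2) + 2 + 2\pi J$, using the classical identity $\psi(1/4) - \psi(1/2) = -\pi/2 - \ln 2$, which is a known special value and not a statement taken from the text. The function names and quadrature parameters are hypothetical.

```python
import math

def tanh_over_sinh_integral(upper: float = 4.0, steps: int = 4000) -> float:
    """Midpoint rule for J = int_R tanh(pi x) / sinh(4 pi x) dx.

    The integrand is even and decays like exp(-4 pi |x|), so the integral is
    computed on [0, upper] and doubled.
    """
    h = upper / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h   # midpoints avoid the removable point x = 0
        total += math.tanh(math.pi * x) / math.sinh(4.0 * math.pi * x)
    return 2.0 * h * total

def p_c_from_final_step() -> float:
    """p_c = psi(1/4) - psi(1/2) + 2 + 2 pi J, with psi(1/4) - psi(1/2)
    replaced by its closed form -pi/2 - ln 2 (assumption stated above)."""
    J = tanh_over_sinh_integral()
    return (-math.pi / 2.0 - math.log(2.0)) + 2.0 + 2.0 * math.pi * J

if __name__ == "__main__":
    print(tanh_over_sinh_integral(), 0.25 - 1.0 / (2.0 * math.pi))
    print(p_c_from_final_step())
```

The second printed line should round to 0.3069, the value stated in Theorem 1.1.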